{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Practice: Orthogonality of Hermite Polynomials\n",
    "\n",
    "In this exercise, modify the Monte-Carlo example to demonstrate the orthonormality of the two [Hermite polynomials](https://en.wikipedia.org/wiki/Hermite_polynomials)\n",
    "\n",
    "* $1$ and\n",
    "* $x^2-1$\n",
    "\n",
    "with respect to the weight $e^{-\\frac{x^2}2}$, i.e. show (numerically, using a Monte Carlo method) that\n",
    "\n",
    "$$\n",
    "\\int_{-\\infty}^\\infty 1 \\cdot (x^2-1) \\cdot e^{-\\frac{x^2}2}dx = 0\n",
    "$$\n",
    "\n",
    "and that\n",
    "\n",
    "$$\n",
    "\\int_{-\\infty}^\\infty (x^2-1)^2  \\cdot e^{-\\frac{x^2}2}dx = 2\\sqrt{2\\pi}.\n",
    "$$\n",
    "\n",
    "Realize that\n",
    "$$\n",
    "\\int_{-\\infty}^\\infty \\dots  \\cdot \\frac{e^{-\\frac{x^2}2}}{\\sqrt{2\\pi}}dx\n",
    "$$\n",
    "can be evaluated by Monte-Carlo summation of $\\dots$ where the $x$ are normally distributed.\n",
    "\n",
    "Use the [Box-Muller transform](https://en.wikipedia.org/wiki/Box%E2%80%93Muller_transform) to obtain normally-distributed random numbers from the uniformly distributed ones returned by PyOpenCL's random number generator.\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Initialization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pyopencl as cl\n",
    "import pyopencl.array\n",
    "import pyopencl.clrandom\n",
    "\n",
    "ctx = cl.create_some_context()\n",
    "queue = cl.CommandQueue(ctx)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Boilerplate for Random Number Generator"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "generator_preamble = \"\"\"\n",
    "#include <pyopencl-random123/philox.cl>\n",
    "\n",
    "typedef union {\n",
    "    uint4 v;\n",
    "    philox4x32_ctr_t c;\n",
    "} philox4x32_ctr_vec_union;\n",
    "\n",
    "\n",
    "uint4 philox4x32_bump(uint4 ctr)\n",
    "{\n",
    "    if (++ctr.x == 0)\n",
    "        if (++ctr.y == 0)\n",
    "            ++ctr.z;\n",
    "    return ctr;\n",
    "}\n",
    "\n",
    "uint4 philox4x32_gen(\n",
    "        uint4 ctr,\n",
    "        uint2 key,\n",
    "        uint4 *new_ctr)\n",
    "{\n",
    "    philox4x32_ctr_vec_union result;\n",
    "    result.c = philox4x32(\n",
    "        *(philox4x32_ctr_t *) &ctr,\n",
    "        *(philox4x32_key_t *) &key);\n",
    "    *new_ctr = philox4x32_bump(ctr);\n",
    "    return result.v;\n",
    "}\n",
    "\n",
    "float4 philox4x32_f32(\n",
    "        uint4 ctr,\n",
    "        uint2 key,\n",
    "        uint4 *new_ctr)\n",
    "{\n",
    "    *new_ctr = ctr;\n",
    "    return\n",
    "        convert_float4(philox4x32_gen(*new_ctr, key, new_ctr))\n",
    "        * 2.3283064365386963e-10f;\n",
    "}\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Monte-Carlo code"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "#clear\n",
    "\n",
    "from mako.template import Template\n",
    "\n",
    "mc_preamble_src = Template(\"\"\"\n",
    "\n",
    "#include <pyopencl-complex.h>\n",
    "\n",
    "float compute_sample(int i, unsigned int k1)\n",
    "{\n",
    "    uint4 ctr = { 0, 1, 2, 3 };\n",
    "    uint2 key2 = { i, k1 };\n",
    "    float4 rng_res = philox4x32_f32(ctr, key2, &(ctr));\n",
    "    \n",
    "    float r0 = sqrt(-2*log(rng_res.s0));\n",
    "    float v0 = r0*cos((float) (2*M_PI) * rng_res.s1);\n",
    "    float v1 = r0*sin((float) (2*M_PI) * rng_res.s1);\n",
    "\n",
    "    float r2 = sqrt(-2*log(rng_res.s2));\n",
    "    float v2 = r2*cos((float) (2*M_PI) * rng_res.s3);\n",
    "    float v3 = r2*sin((float) (2*M_PI) * rng_res.s3);\n",
    "    \n",
    "    float result = 0;\n",
    "    \n",
    "    %for x in [\"v0\", \"v1\", \"v2\", \"v3\"]:\n",
    "    {\n",
    "        float x = ${x};\n",
    "        float H2 = x*x - 1;\n",
    "        result += H2;\n",
    "    }\n",
    "    %endfor\n",
    "    \n",
    "    return result;\n",
    "}\n",
    "\"\"\", strict_undefined=True).render()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "#clear\n",
    "\n",
    "from pyopencl.reduction import ReductionKernel\n",
    "\n",
    "rknl = ReductionKernel(ctx, np.float32,\n",
    "        neutral=\"0\",\n",
    "        reduce_expr=\"a+b\", map_expr=\"compute_sample(i, k1)\",\n",
    "        arguments=\"unsigned int k1\", preamble=generator_preamble+mc_preamble_src)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.000107572192383\n"
     ]
    }
   ],
   "source": [
    "#clear\n",
    "n = 10000000\n",
    "\n",
    "nsamples = 4*n\n",
    "result = rknl(15, range=slice(n), queue=queue).get() / nsamples\n",
    "\n",
    "print(result)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.1+"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
